Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(snapshot): adding the design doc for mountable snapshots #183

Open
wants to merge 3 commits into
base: develop
Choose a base branch
from

Conversation

wowditi
Copy link

@wowditi wowditi commented Apr 6, 2022

Pull Request template

Please, go through these steps before you submit a PR.

Why is this PR required? What issue does it fix?:

What this PR does?: It adds a design for a new feature

Does this PR require any upgrade changes?: No

If the changes in this PR are manually verified, list down the scenarios covered::

Any additional information for your reviewer? :
In the drawbacks section I posed the question of whether we should maybe use annotation to indicate that we are mounting a snapshot instead of creating a new PV as a clone of the snapshot. I'd like a second opinion on whether this is a good idea or what a good alternative could be (for example, we could create a new api-resource although I don't think that would be such a good idea).
Also a PVC requires the Storage to be set, however, it is a little bit useless in this case especially when mounting snapshots. Any suggestions on handling that better would be appreciated.

Checklist:

  • Fixes #
  • PR Title follows the convention of <type>(<scope>): <subject>
  • Has the change log section been updated?
  • Commit has unit tests
  • Commit has integration tests
  • (Optional) Are upgrade changes included in this PR? If not, mention the issue/PR to track:
  • (Optional) If documentation changes are required, which issue on https://github.com/openebs/openebs-docs is used to track them:

## Drawbacks

- As far as I understand it, normally when creating a PersistentVolumeClaim with a snapshot as datasource it would clone the Snapshot into an actual volume, so we'd be diverging from this behaviour.
- We could use an annotation to specify the behaviour that we are implementing here so that the clone behaviour can be added at a later time?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you mention the annotation and also how we are going to use that?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added some additional details to the document. Basically we annotate the PVC to indicate whether it needs to clone the snapshot or mount it.

The changes below are the primary modules that need to be modified, however, there are some additional utility modules that will need to be modified to support these changes. Most likely the following files will need some additions to support snapshots: [kubernetes.go](../../pkg/builder/volbuilder/kubernetes.go) (in order to implement a `Kubeclient.GetSnapshot` function), [lvm_util.go](../../pkg/lvm/lvm_util.go) (in order to create a function that changes the snapshot write access) and [mount.go](../../pkg/lvm/mount.go) (in order to support mounting/unmounting of snapshots).

- In the [controller.go](../../pkg/driver/controller.go) the code path in the `CreateVolume` function for the `controller` type that occurs when contentSource.GetSnapshot() is not `nil` needs to be implemented. When this path is triggered it should return the correct `volName`, `topology` and `cntx`.
- In the [agent.go](../../pkg/driver/agent.go) the `NodePublishVolume` function for the `node` type needs to be changed such that it checks whether the `volName` is a snapshot and if so mount the snapshot to the specified location.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you describe how we are going to find which snapshot we have to mount, how do we find that?

Copy link
Author

@wowditi wowditi May 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added some additional details in the doc as to how this can be done. But it's rather straightforward, the VolumeId (which is should be provided to this function in the request) should be of the format snapshot-<uuid> (or possibly pvc-<original-uuid>@snapshot-<snapshot-uuid>) where the uuid is also the name of the snapshot on the host.

The changes below are the primary modules that need to be modified, however, there are some additional utility modules that will need to be modified to support these changes. Most likely the following files will need some additions to support snapshots: [kubernetes.go](../../pkg/builder/volbuilder/kubernetes.go) (in order to implement a `Kubeclient.GetSnapshot` function), [lvm_util.go](../../pkg/lvm/lvm_util.go) (in order to create a function that changes the snapshot write access) and [mount.go](../../pkg/lvm/mount.go) (in order to support mounting/unmounting of snapshots).

- In the [controller.go](../../pkg/driver/controller.go) the code path in the `CreateVolume` function for the `controller` type that occurs when contentSource.GetSnapshot() is not `nil` needs to be implemented. When this path is triggered it should return the correct `volName`, `topology` and `cntx`.
- In the [agent.go](../../pkg/driver/agent.go) the `NodePublishVolume` function for the `node` type needs to be changed such that it checks whether the `volName` is a snapshot and if so mount the snapshot to the specified location.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to know whether the volName is a snapshot? We are creating a volume out of the snapshot and then simply we want to mount that newly created volume. Am I interpreting it right or am I missing something?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I figured we'd need to call different functions so that we end up with an *apis.LVMSnapshot object instead of a *apis.LVMVolume object, but that might not even be needed. I suppose that's mainly an implementation detail and this should simply say that it mount the snapshot.

In order to implement this only existing code needs to be changed, there is no need for an entirely new control flow.
The changes below are the primary modules that need to be modified, however, there are some additional utility modules that will need to be modified to support these changes. Most likely the following files will need some additions to support snapshots: [kubernetes.go](../../pkg/builder/volbuilder/kubernetes.go) (in order to implement a `Kubeclient.GetSnapshot` function), [lvm_util.go](../../pkg/lvm/lvm_util.go) (in order to create a function that changes the snapshot write access) and [mount.go](../../pkg/lvm/mount.go) (in order to support mounting/unmounting of snapshots).

- In the [controller.go](../../pkg/driver/controller.go) the code path in the `CreateVolume` function for the `controller` type that occurs when contentSource.GetSnapshot() is not `nil` needs to be implemented. When this path is triggered it should return the correct `volName`, `topology` and `cntx`.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have anything specific in mind on how to implement this?

Copy link
Author

@wowditi wowditi May 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a matter of fact yes, I created a quick PoC a while back to see if this would even be feasible.
The first step is determining the volume name, which can be parsed from the SnapshotId (contentSource.GetSnapshot().GetSnapshotId()). I'd suggest passing snapshot-<uuid> as the volume name, but the entire snapshot_id would also work.
We'd also need to create a function that retrieves the *lvmapi.LVMSnapshot object based on the request (instead of the *lvmapi.LVMVolume object we get after creating a new LVM Volume. The process is quite similar to that of building a regular volume (using the volbuilder.NewBuilder() function). The only information that is not already present is that of the owner, but we can retrieve this by using the existing lvm.GetLVMSnapshot function.
With this information creating the topology and cntx is exactly the same as before only using the *lvmapi.LVMSnapshot instead of the *lvmapi.LVMVolume.

This feels too detailed to be in a design document and more an implementation specific part, but if you disagree I can add it to the design doc.

metadata:
name: csi-lvmpvc-snap
spec:
storageClassName: openebs-lvmpv

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is storage class required to be of local.csi.openebs.io provisioner type? Can we have different storage class consuming the lvm snapshots?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the dataSource is provided separately I don't think it's required. I'm not entirely sure what's involved with creating a new provisioner type, but it would be a good solution to the downside that I mentioned in the design.

@wowditi wowditi force-pushed the add-mountable-snapshot-design-doc branch from 870b703 to abdb065 Compare May 2, 2022 09:37
@Ab-hishek Ab-hishek changed the title <chore>(snapshot): adding the design doc for mountable snapshots chore(snapshot): adding the design doc for mountable snapshots Jun 2, 2022
Copy link

@Ab-hishek Ab-hishek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @wowditi. This document looks good. Just a minor comment. Since, this is a design document we really don't need to mention the exact files or function names where the changes need to be done. We can simply mention the goals, non-goals, design change/enhancement, any diagrams if needed. Therefore can you make the necessary changes for it.

@Abhinandan-Purkait
Copy link
Member

@wowditi Can you please provide an update on the plans for taking this forward? Are you willing to work on this?

@orville-wright
Copy link
Contributor

orville-wright commented Mar 29, 2024

Hi @wowditi I run Product Mgmt for the OpenEBS project.
You've put a lot of amazingly impressive & detailed work into your PR and it received a bunch of good reviews from our ENG team. Overall very positive.
LVM-localPV is now getting a major focus in the OpenEBS product as it's being unified into the core platform as a central component, rather than being managed as a separate external Data-Engine.
Additionally, LVM-LocalPV now has ~50,000 Daily Active Users and +150,000 product installations. So it's now a very well validated part of the platform. Hence we're are unifying it into the main product.

We'd really like to work on developing your PR and integrate the design into the new unified product.

Are you still interested and willing to support your PR?

@wowditi
Copy link
Author

wowditi commented Apr 2, 2024

Hi,
Sorry I forgot about this since my company started using the ZFS variant since it already had the features we need.
I do not have a lot of spare time currently, but I could probably help since it wasn't that much work from what I remember.

About the design document, the implementation details are indeed very technical currently and I can make those a bit more high level.
After using the ZFS localpv I also noticed that this is basically the same behaviour as what they refer to as cloning, should I rewrite the document to call it cloning and remove the drawback?

PS: I should probably mention that I mainly code in python and not Golang, so it might not be the most beautiful code.

@dsharma-dc
Copy link
Contributor

I see at least a few things that aren't talked about in this document:

  1. There is an idea in the document that if the PVC for snapshot is created with a capacity(resource) - we could extend the lvm if the defined storage size is larger then that of the snapshot (with a maximum of the size of the original volume). If the snapshot is thick and is extended to a cap of original LV size, it'll become invalid in view of LVM. How do we deal with that during PVC provisioning.
  2. If say there is a volume V1 of size S, a snapshot of V1 is created of size 50% i.e. S/2. Now a PVC P is provisioned with snapshot. If the snapshot is configured to auto-extend, the snapshot size at some point becomes larger than size of P. How does that affect the pvc P ?
  3. If a mounted snapshot becomes invalid (100% of original LV size), how does that impact the pvc which is bound to it?

@dsharma-dc dsharma-dc self-requested a review July 2, 2024 11:23
@dsharma-dc dsharma-dc added the question Further information is requested label Jul 30, 2024
@dsharma-dc
Copy link
Contributor

This hasn't yet progressed based on last set of clarifying questions. Keeping a track of it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
design question Further information is requested
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants